NOTE: All the information is already provided to you. you can use previously provided information and files.

CS301 Data Structure
Assignment 4: Infix-Postfix Expression Calculator

Spring 2003
Instructor: Sohail Aslam

Assigned: November 4, 2003
Part-A due: November 10, 2003
Part-B due: November 15, 2003

The Assignment:
You will write a program that acts as a simple calculator, reading a file that contains an arithmetic expression on each line. The program evaluates these expressions and prints the values to cout. The assignment will be done in two parts.

Purposes:
Ensure that you can use the Stack template class that we provide for you.
Teach you how a program can use its command line parameters.
Teach you to read input from a file (rather than from cin).
Ensure that you understand and can implement the stack-based algorithms for evaluating an arithmetic expression.
Files that you must write:
  1. Calc.cpp: This is a file that contains your main program for this assignment, along with various other functions that you write for your main program to use. The purpose of the main program is to read a file in which each line contains an arithmetic expression. The program evaluates these expressions and prints their values to cout.

    The details of this main program are given in the sequence of exercises listed below.

Other files that you may find helpful:
  1. stack.h and stack.cpp: These are the header file and implementation file for the Stack template classes from. Reminder: Since this is a template class, it is not separately compiled. Instead, your main program simply includes "stack.cpp". With this include statement in place, the main program can declare and use any kind of stack.
  2. infix.dat and postfix.dat: These are input files that contain expressions in infox and postfix forms. You can add more expressions if you like.

 

Infix Expression to Post-Fix Expression

Conventional notation is called infix notation. The arithmetic operators appears between two operands. Parentheses are required to specify the order of the operations. For example: a + (b * c).

Post fix notation (also, known as reverse Polish notation) eliminates the need for parentheses. There are no precedence rules to learn, and parenthese are never needed. Because of this simplicity, some popular hand-held calculators use postfix notation to avoid the complications of multiple sets of parentheses. The operator is placed directly after the two operands it needs to apply. For example: a b c * +. This is true for calculators made by Hewlet-Packard (hp).

This short example makes the move from infix to postfix intuitive. However, as expressions get more complicated, there will have to be rules to follow to get the correct result:

Evaluating expression

A stack is used in two phases of evaluating an expression such as

                3 * 2 + 4 * (A + B)
  1. Convert the infix form to postfix using a stack to store operators and then pop them in correct order of precedence. 
  2. Evaluate the postfix expression by using a stack to store operands and then pop them when an operator is reached.

Infix to postfix conversion

Here are the steps involved in the conversion:

  1. Scan through an expression, getting one token (operator or operand) at a time.
  2. Fix a priority level for each operator. For example, from high to low:
    • level-4      -            (unary negation)
    • level-3      ^, %       (exponentiation, mod)
    • level-2      *, /         ( multiplication and division)
    • level-1      -,  +       ( subtraction and addition )
    Thus, high priority corresponds to high number in the table.
  3. If the token is an operand, do not stack it. Pass it to the output.
  4. If token is an operator or parenthesis, do the following:
    1. Pop the stack until you find a symbol of lower priority number than the current one. An incoming left parenthesis will be considered to have higher priority than any other symbol. A left parenthesis on the stack will not be removed unless an incoming right parenthesis is found. The popped stack elements will be written to output.
    2. Stack the current symbol.
    3. If a right parenthesis is the current symbol, pop the stack down to (and including) the first left parenthesis.
    4. Write all the symbols except the left parenthesis to the output (i.e. write the operators to the output).
    5. After the last token is read, pop the remainder of the stack and write any symbol (except left parenthesis) to output.

Example:

Convert A * (B + C) * D to postfix notation.

Move
Curren Ttoken
Stack Output
1
A
empty A
2
*
* A
3
(
(* A
4
B
(* A B
5
+
+(* A B
6
C
+(* A B C
7
)
* A B C +
8
*
* A B C + *
9
D
* A B C + * D
10

empty
Notes:
In this table, the stack grows toward the left. Thus, the top of the stack is the leftmost symbol.
In move 3, the left paren was pushed without popping the * because * had a lower priority then "(".
In move 5, "+" was pushed without popping "(" because you never pop a left parenthesis until you get an incoming right parenthesis. In other words, there is a distinction between incoming priority, which is very high for a "(", and instack priority, which is lower than any operator.
In move 7, the incoming right parenthesis caused the "+" and "(" to be popped but only the "+" as written out.
In move 8, the current "*" had equal priority to the "*" on the stack. So the "*" on the stack was popped, and the incoming "*" was pushed onto the stack.

Evaluating Postfix Expressions

Once an expression has been converted to postfix notation it is evaluated using a stack to store the operands.

  1. Step through the expression from left to right, getting one token at a time.
  2. Whenever the token is an operand, stack the operand in the order encountered.
  3. When an operator is encountered:
  4. At the end of the expression, the top of the stack will have the correct value for the expression.


Example:

Evaluate the expression 2 3 4 + * 5 * which was created by the previous algorithm for infix to postfix.

Move
Current Token
Stack (grows toward left)
1
2

2
2
3

3 2
3
4

4 3 2
4
+

7 2
5
*

14
6
5

5 14
7
*

70
Notes:
Move 4: an operator is encountered, so 4 and 3 are popped, summed, then pushed back onto stack.
Move 5: operator * is current token, so 7 and 2 are popped, multiplied, pushed back onto stack.
Move 7: stack top holds correct value.
Notice that the postfix notation has been created to properly reflect operator precedence. Thus, postfix expressions never need parentheses.

A Calculator Program
Discussion of the Assignment

This discussion will lead you through a series of exercises that discuss some of the input issues that you will face in writing a program to read a file of expressions and print the values of these expressions to cout.

Command Line Arguments
When you execute a program in most operating systems, you type the name of the program followed by a list of values called the "command line arguments." For example, in the Unix or DOS operating system, you might type the following to execute the mkdir program:

mkdir hw06

The name of the program is "mkdir", and there is one command line argument, "hw06". Other operating systems allow you to type commands along with command line arguments in other manners (such as the "Run" dialog box in various Windows operating systems).

When you write and compile a C++ program, the resulting program can also be executed by typing its name along with command line arguments. Your instructor may have to show you how to do this on your particular operating system. For example, you can write a program called Calc, compile the program, and execute it with the statement:

Calc minimize your therbligs

The running Calc program has access to the three strings "minimize", "your", and "therbligs". In order to obtain this access, you need to make a small change to the parameter list of the main function in the Calc program. This parameter list should be changed to:

int main(int argc, char *argv[ ])

As you can see, the main program has two parameters that it can access just like any other parameters. The meaning of these parameters is determined by the command line arguments in a manner that is described in the next few exercises.

Exercise 1: Using the Argc Parameter to the Main Function

The first parameter, argc, is the count of the number of command line arguments. For example, if we execute the command:

Calc minimize your therbligs

then calc's main program will have argc set to 4 (it counts the name of the program as one of the four "arguments").

For this exercise, write a complete main program called Calc.cxx. The main program should have the two parameters, as shown above. The body of the main program should simply print the number of arguments in a message such as "There are 4 Calc arguments!". Compile the main program into an executable file called Calc. Run the resulting Calc program with several different choices of command line parameters.

Exercise 2: Using the Argv Array

The main program's second argument, argv, has the declaration: char* argv[ ]. What does this mean? It means that argv is an array of pointers to characters. In other words, argv[0], argv[1], argv[2], ... are each a pointer to a character. In fact, each of these is a pointer to a dynamic array of characters. For example, suppose we execute our favorite command:

Calc minimize your therbligs

In this example, argv[1] will be a pointer to a dynamic array of characters that contains the first command line parameter "minimize". As you might guess, argv[2] will be a pointer to a dynamic array that contains a pointer to a dynamic array of characters that contains "your". And argv[3] will be a pointer to a dynamic array of characters that contains "therbligs".

And what about argv[0]? That contains a pointer to a dynamic array of characters that contains the actual command that was typed. In our example, argv[0] points to a dynamic array containing "Calc".

Each of the dynamic arrays argv[0], argv[1], ... can be treated just like any other string. For example, the first command line parameter can be printed to the screen by: cout << argv[1];

For your next exercise, modify your Calc program from Exercise 1 so that it also prints a copy of the command that was typed along with the command line parameters. For example, if you execute the command that we have been using in our examples, then the output of the Calc program will now be:

There are 4 Calc arguments!
Calc minimize your therbligs

Exercise 3: Checking That There is the Right Number of Arguments


Most programs require a certain number of command line parameters. If you provide the wrong number of command line parameters, then the program provides a "usage message" trying to tell you the correct way to use the program. Let's suppose that your Calc program is supposed to read input from an input file, and that Calc is always used with two command line arguments (the name Calc is the first argument, and the name of the input file is the second argument). If Calc is executed with more or less arguments, then you want a succinct error message to be printed such as "Usage: Calc input_file." This is a succinct message describing how Calc is supposed to be used.

For this exercise, add a new function to your Calc program, with this prototype

void validate_argc(int argc, int right, const char usage[ ]);

The function has three arguments. The first argument, argc, is the value of argc from the main program. The second argument, right, is the correct number of arguments that this program is supposed to have. The third argument, usage, is a string which is a suscinct usage message for the program. Here's what the function does: If argc is equal to right, then the function does nothing. But if argc differs from right, then the function prints the message and halts the program. Two points:

  1. Print the message to cerr, rather than to cout. (since cerr is the "standard error output").
  2. You can halt the program by calling a function exit(0).


Once the validate_argc function is added to your program, include a call to the function in your main program. The call should indicate that the right number of arguments is 2, and provide the usage message: "Usage: Calc input_file."

Reading From a File

By the way, you are headed toward a program that reads a file of arithmetic expressions as input. Eventually the program will evaluate all the expressions, and write the resulting answers to cout. Input from a file is easy in C++. Here are the steps:

  1. The program must #include <fstream.h>.

  2. The program needs to declare an ifstream variable that will "attach" to the input file. The data type of this variable is ifstream. For example, you could make the declaration of an ifstream variable called input, as shown here:

    ifstream input; // Will be attached to the input file

     

  3. The ifstream must be attached to the input file by activating the open function. The argument to the open function is an ordinary C string constant or string variable. For example, if input is an ifstream, then we can attach input to a file named foo with the function call:

    input.open("foo");

    The argument to the open function can also be a string variable-- i.e., an array of characters with the last character followed by the null character '\0'.

  4. You must check that the file was successfully opened. This check can be made by activating the fail function. This function returns a true/false value to indicate whether the ifstream has failed to properly attach to the input file (true indicates failure). I'd suggest something like this:
            if (input.fail( ))
            {
                cerr << "Could not open input file." << endl;
                exit(0); 
            }
            
  5. Once the ifstream input has been successfully opened, you may use the name input in the same way that you use other input devices (such as cin). For example, suppose that the next item in the input file is an integer, and i is an integer variable. Then you may execute the statement:

    input >> i; // Read an integer from the input file

    All the other familiar input functions (such as peek and ignore) can be used with the ifstream, in the same way that you have previously used these function with cin.

  6. Eventually you will read the end of the input file. At the end of the file there is usually a special character called EOF (which is not actually part of the data of the file). You can also use the name of the ifstream (such as input in our example) as a boolean expression which is true so long as the file has not contained any bad input (such as an alphabetic character when a digit is expected). Thus, the complete boolean expression to test whether there is more input reads like this:

    (input && input.peek( ) != EOF)

  7. When you are done reading from the input file, then you should activate the close function, as shown here for our example:

    input.close( );

Exercise 4: Three More Functions for the Calc Program

Write the following three functions, and add them to your Calc.cpp:

  1. A function that opens a file for input and checks that the opening process did not fail. The specification for this function is:

    void open_for_read(ifstream& f, const char filename[ ]);
    // Postcondition: The ifstream f has been opened for
    // reading using the given filename. If any errors
    // occurred then an error message has been printed
    // and the program has been halted.

  2. A boolean function that determines whether a specified file still has valid input. The specification is:

    bool is_more(istream& f);
    // Postcondition: The return value is true if f still has
    // more valid input to read; otherwise the return
    // valid is false.

    Notice that the data type of the argument is an "istream" rather than an "ifstream". An istream is usually an ifstream, although later you will learn of other kinds of istreams (such as cin). Istream arguments must always be reference parameters.

  3. A function named process_line with this prototype:

    void process_line(istream& f, bool& okay, double& answer);

    For now, you can implement a simple version of process_line. The simple version reads one input line, sets okay to true, and sets answer to 42.0. Later you will implement a more complex version of the function that actually evaluates an arithmetic expression from the input line. Anyway, put the simple version in your Calc.cpp program and change the body of the main function so that it does this:

    • Declares an ifstream variable called input and uses open_for_read to open input for reading. Use argv[1] as the name of the input file.
    • Have a loop that continues while there is more input. In the body of the loop, call process_line. If process_line sets okay to true then print the value of answer on the next output line; otherwise print the word INVALID on the next output line.
    • Close the input file and print a message saying that the program is done.

The Rest of This Programming Assignment


For the rest of this programming assignment, modify the process_line function so that it reads the next line of input (including the newline character at the end of the line). It treats this input line as an arithmetic expression. If the expression has an error (such as division by zero), then the function sets okay to false and leaves answer unchanged. Otherwise, the function sets okay to true and sets answer to the value of the expression.

Here are a few considerations:

Important Note: How to submit your assignment

We have told you many times about these measures but we have seen that you are not following these measures carefully. 
If you will not follow them you will loose marks. So read them carefully: